Training Very Deep Networks via Residual Learning with Stochastic Input Shortcut Connections
Many works have posited the benefit of depth in deep networks. However,
one of the problems encountered in training very deep networks is diminishing feature
reuse; that is, features are "diluted" as they are forward-propagated through
the model. Hence, later network layers receive less informative signals about the
input data, making training less effective. In this work, we address
the problem of feature reuse by taking inspiration from an earlier work that
employed residual learning to alleviate it. We propose
a modification of residual learning for training very deep networks that realizes
improved generalization performance: we allow stochastic shortcut connections
of identity mappings from the input to the hidden layers. We perform extensive
experiments on the USPS and MNIST datasets. On the USPS dataset, we
achieve an error rate of 2.69% without employing any form of data augmentation
(or manipulation). On the MNIST dataset, we reach an error rate of 0.52%,
comparable to the state of the art. Notably, these results are achieved without
employing any explicit regularization technique.
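
To make the idea concrete, below is a minimal PyTorch sketch of what a hidden layer with a stochastic identity shortcut from the network input could look like. This is an illustration only: the class name StochasticInputShortcutLayer, the shortcut probability p, the test-time scaling by p, and the learned projection used when dimensions differ are assumptions for this sketch, not the paper's exact specification.

import torch
import torch.nn as nn


class StochasticInputShortcutLayer(nn.Module):
    """One hidden layer with a stochastic shortcut from the network input.

    Hypothetical sketch: layer widths, the shortcut probability p, and the
    projection used when input and hidden dimensions differ are assumptions.
    """

    def __init__(self, in_dim, hidden_dim, p=0.5):
        super().__init__()
        self.fc = nn.Linear(hidden_dim, hidden_dim)
        self.act = nn.ReLU()
        # An identity mapping requires matching dimensions; otherwise we
        # project the input with a learned linear map (an assumption).
        self.proj = (nn.Identity() if in_dim == hidden_dim
                     else nn.Linear(in_dim, hidden_dim, bias=False))
        self.p = p  # probability that the input shortcut is active

    def forward(self, h, x):
        # h: activations from the previous hidden layer
        # x: the original network input
        out = self.act(self.fc(h))
        if self.training:
            # During training, add the (projected) input with probability p,
            # sampled independently at each forward pass.
            if torch.rand(1).item() < self.p:
                out = out + self.proj(x)
        else:
            # At test time, scale the shortcut by its expected contribution,
            # in the spirit of dropout/stochastic-depth averaging (assumed).
            out = out + self.p * self.proj(x)
        return out

In a full network under these assumptions, every hidden layer would receive both the previous layer's activation h and the original input x, each layer sampling its shortcut independently; this resembles stochastic depth applied to input-to-hidden shortcuts rather than to residual blocks.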